Cognitive problem isolation in quick provision fault analysis

ABSTRACT

An approach is provided in which a set of provision information is generated from a set of provisioners that are in process of fulfilling a client's provision request. The approach creates a set of provision events based on the set of provision information and, in response to detecting a failure of the provision request, the approach generates a provision chain from the set of provision events. The provision chain links the set of provision events based on correlation rules and identifies at least one isolation point of the failure. The approach informs the client of the at least one isolation point of the failure identified in the provision chain.

BACKGROUND

Provisioning is the allocation of resources and services to a user. Cloud provisioning is based on procedures that specify how a user procures cloud services and resources from a cloud provider. Cloud services include infrastructure as a service, software as a service, and platform as a service. Cloud resources include processing resources, network resources, storage resources, and so on.

In current cloud architectures, a service provisioner includes multiple sub provisioners that support different types of activities. Identity and Access Management (IAM) is a framework of policies and technologies that ensure users in an enterprise have appropriate access to technology resources (e.g., an authentication manager for each provision operation). When a provision request reaches the service provisioner, a series of provision operations in the service provisioner occurs. For example, the service provisioner may trigger a provision operation based on the provision request, the service provisioner may trigger an IAM request for authentication, and/or a provision operation is triggered by other operations.

BRIEF SUMMARY

According to one embodiment of the present disclosure, an approach is provided in which a set of provision information is generated from a set of provisioners that are in process of fulfilling a client's provision request. The approach creates a set of provision events based on the set of provision information and, in response to detecting a failure of the provision request, the approach generates a provision chain from the set of provision events. The provision chain links the set of provision events based on correlation rules and identifies at least one isolation point of the failure. The approach informs the client of the at least one isolation point of the failure identified in the provision chain.

The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present disclosure, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The present disclosure may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings, wherein:

FIG. 1 is a block diagram of a data processing system in which the methods described herein can be implemented;

FIG. 2 provides an extension of the information handling system environment shown in FIG. 1 to illustrate that the methods described herein can be performed on a wide variety of information handling systems which operate in a networked environment;

FIG. 3 is an exemplary diagram depicting a performance monitor generating provision chains and using the provision chains to identify root causes of provision request failures;

FIG. 4 is an exemplary diagram depicting details of information capture module plugins;

FIG. 5 is an exemplary diagram depicting provisioning event data aggregated into a provisioning events queue;

FIG. 6 is an exemplary diagram depicting a fault analysis module generating a provision chain from provision events;

FIG. 7 is an exemplary flowchart depicting steps taken to capture provision event information and identify isolation points of failures based on generated provision chains;

FIG. 8 is an exemplary flowchart depicting steps taken to generate a provision chain based on provision events;

FIG. 9 is an exemplary diagram depicting a detailed provision chain result; and

FIG. 10 is an exemplary diagram depicting a refined provision chain.

DETAILED DESCRIPTION

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be accomplished as one step, executed concurrently, substantially concurrently, in a partially or wholly temporally overlapping manner, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions. The following detailed description will generally follow the summary of the disclosure, as set forth above, further explaining and expanding the definitions of the various aspects and embodiments of the disclosure as necessary.

FIG. 1 illustrates information handling system 100, which is a simplified example of a computer system capable of performing the computing operations described herein. Information handling system 100 includes one or more processors 110 coupled to processor interface bus 112. Processor interface bus 112 connects processors 110 to Northbridge 115, which is also known as the Memory Controller Hub (MCH). Northbridge 115 connects to system memory 120 and provides a means for processor(s) 110 to access the system memory. Graphics controller 125 also connects to Northbridge 115. In one embodiment, Peripheral Component Interconnect (PCI) Express bus 118 connects Northbridge 115 to graphics controller 125. Graphics controller 125 connects to display device 130, such as a computer monitor.

Northbridge 115 and Southbridge 135 connect to each other using bus 119. In some embodiments, the bus is a Direct Media Interface (DMI) bus that transfers data at high speeds in each direction between Northbridge 115 and Southbridge 135. In some embodiments, a PCI bus connects the Northbridge and the Southbridge. Southbridge 135, also known as the Input/Output (I/O) Controller Hub (ICH), is a chip that generally implements capabilities that operate at slower speeds than the capabilities provided by the Northbridge. Southbridge 135 typically provides various busses used to connect various components. These busses include, for example, PCI and PCI Express busses, an ISA bus, a System Management Bus (SMBus or SMB), and/or a Low Pin Count (LPC) bus. The LPC bus often connects low-bandwidth devices, such as boot ROM 196 and “legacy” I/O devices (using a “super I/O” chip). The “legacy” I/O devices (198) can include, for example, serial and parallel ports, keyboard, mouse, and/or a floppy disk controller. Other components often included in Southbridge 135 include a Direct Memory Access (DMA) controller, a Programmable Interrupt Controller (PIC), and a storage device controller, which connects Southbridge 135 to nonvolatile storage device 185, such as a hard disk drive, using bus 184.

ExpressCard 155 is a slot that connects hot-pluggable devices to the information handling system. ExpressCard 155 supports both PCI Express and Universal Serial Bus (USB) connectivity as it connects to Southbridge 135 using both the USB and the PCI Express bus. Southbridge 135 includes USB Controller 140 that provides USB connectivity to devices that connect to the USB. These devices include webcam (camera) 150, infrared (IR) receiver 148, keyboard and trackpad 144, and Bluetooth device 146, which provides for wireless personal area networks (PANs). USB Controller 140 also provides USB connectivity to other miscellaneous USB connected devices 142, such as a mouse, removable nonvolatile storage device 145, modems, network cards, Integrated Services Digital Network (ISDN) connectors, fax, printers, USB hubs, and many other types of USB connected devices. While removable nonvolatile storage device 145 is shown as a USB-connected device, removable nonvolatile storage device 145 could be connected using a different interface, such as a Firewire interface, etcetera.

Wireless Local Area Network (LAN) device 175 connects to Southbridge 135 via the PCI or PCI Express bus 172. LAN device 175 typically implements one of the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standards of over-the-air modulation techniques that all use the same protocol to wirelessly communicate between information handling system 100 and another computer system or device. Optical storage device 190 connects to Southbridge 135 using Serial Advanced Technology Attachment (SATA) bus 188. Serial ATA adapters and devices communicate over a high-speed serial link. The Serial ATA bus also connects Southbridge 135 to other forms of storage devices, such as hard disk drives. Audio circuitry 160, such as a sound card, connects to Southbridge 135 via bus 158. Audio circuitry 160 also provides functionality associated with audio hardware such as audio line-in and optical digital audio in port 162, optical digital output and headphone jack 164, internal speakers 166, and internal microphone 168. Ethernet controller 170 connects to Southbridge 135 using a bus, such as the PCI or PCI Express bus. Ethernet controller 170 connects information handling system 100 to a computer network, such as a Local Area Network (LAN), the Internet, and other public and private computer networks.

While FIG. 1 shows one information handling system, an information handling system may take many forms. For example, an information handling system may take the form of a desktop, server, portable, laptop, notebook, or other form factor computer or data processing system. In addition, an information handling system may take other form factors such as a personal digital assistant (PDA), a gaming device, Automated Teller Machine (ATM), a portable telephone device, a communication device or other devices that include a processor and memory.

FIG. 2 provides an extension of the information handling system environment shown in FIG. 1 to illustrate that the methods described herein can be performed on a wide variety of information handling systems that operate in a networked environment. Types of information handling systems range from small handheld devices, such as handheld computer/mobile telephone 210 to large mainframe systems, such as mainframe computer 270. Examples of handheld computer 210 include personal digital assistants (PDAs), personal entertainment devices, such as Moving Picture Experts Group Layer-3 Audio (MP3) players, portable televisions, and compact disc players. Other examples of information handling systems include pen, or tablet, computer 220, laptop, or notebook, computer 230, workstation 240, personal computer system 250, and server 260. Other types of information handling systems that are not individually shown in FIG. 2 are represented by information handling system 280. As shown, the various information handling systems can be networked together using computer network 200. Types of computer network that can be used to interconnect the various information handling systems include Local Area Networks (LANs), Wireless Local Area Networks (WLANs), the Internet, the Public Switched Telephone Network (PSTN), other wireless networks, and any other network topology that can be used to interconnect the information handling systems. Many of the information handling systems include nonvolatile data stores, such as hard drives and/or nonvolatile memory. The embodiment of the information handling system shown in FIG. 2 includes separate nonvolatile data stores (more specifically, server 260 utilizes nonvolatile data store 265, mainframe computer 270 utilizes nonvolatile data store 275, and information handling system 280 utilizes nonvolatile data store 285). The nonvolatile data store can be a component that is external to the various information handling systems or can be internal to one of the information handling systems. In addition, removable nonvolatile storage device 145 can be shared among two or more information handling systems using various techniques, such as connecting the removable nonvolatile storage device 145 to a USB port or other connector of the information handling systems.

As discussed above, when a provision request reaches a service provisioner, a series of provision operations in the service provisioner occurs. A challenge, however, is that multiple provision requests occur in parallel, and relationships between operations in the same provision request are uncertain except for directly triggered associations. As a result, when a provision failure occurs, cloud architectures have difficulty isolating the root cause of the provision failure.

FIGS. 3 through 10 depict an approach that can be executed on an information handling system that provides quick provision fault analysis and cognitive problem isolation. The approach introduces a method of problem isolation by generating provision chains and identifying a provision chain with abnormal status (e.g., provision failure). To generate the provision chain, the information handling system uses plugins to collect information from provisioners while fulfilling provision requests and creates provision events from the collected information. The information handling system then uses correlation rules to select events of a provision operation, analyze the events for correlation between different provisioners, and aggregate the events of the provision that belong to the same provisioner into a provision chain. In turn, the information handling system uses the provision chain to identify isolation points of the provision request failure and inform a client of the isolation points.

FIG. 3 is an exemplary diagram depicting a performance monitor generating provision chains and using the provision chains to identify root causes of provision request failures.

Client 300 sends instances requests 302 to service provisioner 304 executing in cloud 306. Service provisioner 304 interacts with dependency services 308 to fulfill instances requests 302. Dependency services 308 includes identity access management (IAM) framework 310, which interacts with, in one embodiment, kubernetes provisioner 312, virtual machine provisioner 314, storage provisioner 316, and network provisioner 318. In turn, dependency services 308 generate instances 320 for client 300 to utilize.

During the provision request fulfillment process, information capture module plugins 322 capture information pertaining to the provisioning process and send the information to provision events queue 334 within performance monitor 340. In one embodiment, information capture module plugins 322 capture the provision information from each of provisioners 312 through 318, such as requester id, provisioner name, instance id, timestamp, serving instance id, and status. Information capture module plugins 322 also capture authorization information from IAM framework 310, such as requester id, resource id, timestamp, serving instance id, and status. In turn, information capture module plugins 322 generate event data based on the captured information and send the event data to provision events queue 334, where provision events queue 334 persists all of the provision events accordingly (see FIGS. 4, 5, and corresponding text for further details).

When client 300 detects a provision request failure, client 300 sends fault analysis request 324 to service provisioner 304. In one embodiment, service provisioner 304 detects the provision request failure. Service provisioner 304 then sends fault analysis request 326 to fault analysis module 328 to begin the fault analysis process.

Correlation engine 330 retrieves the provision event information from provision events queue 334 and generates provision chains 332 based on a set of correlation rules (see FIGS. 7-10 and corresponding text for further details). In turn, fault analysis module 328 sends results 338 to client 300 that indicate which provisioner had a problem that broke the provision chain.

In one embodiment, provision events queue 334 sends information to optimization engine 336. Optimization engine 336, after running for some time, checks whether removing a parameter for a certain type of correlation will cause erroneous outputs. If no conflict exists, optimization engine 336 removes the parameter in both information capture module plugins 322 and correlation engine 330 to optimize the performance of the event capture and analysis process.
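The parameter check performed by optimization engine 336 can be illustrated with a minimal Python sketch. It assumes, hypothetically, that a correlation rule is a list of (candidate field, selected field) pairs and that each stored event carries an `id`; the disclosure does not specify either representation. A parameter is treated as removable when dropping it leaves the set of correlated event pairs unchanged on the recorded events.

```python
def matches(events, rule):
    """Return the set of (selected id, candidate id) pairs that satisfy every
    (candidate_field, selected_field) equality in the rule."""
    return {
        (s["id"], c["id"])
        for s in events
        for c in events
        if s is not c and all(c[cf] == s[sf] for cf, sf in rule)
    }


def removable(events, rule, pair):
    """A parameter is removable if the rule without it yields the same
    correlations on the recorded events (an assumed 'no conflict' test)."""
    reduced = [p for p in rule if p != pair]
    return matches(events, reduced) == matches(events, rule)


# Hypothetical history in which every event comes from the same requester.
history = [
    {"id": 1, "requester_id": "u1", "instance_id": "s1", "serving_instance_id": None},
    {"id": 2, "requester_id": "u1", "instance_id": "k1", "serving_instance_id": "s1"},
    {"id": 3, "requester_id": "u1", "instance_id": "v1", "serving_instance_id": "k1"},
]
rule = [("requester_id", "requester_id"), ("serving_instance_id", "instance_id")]
```

With this history, the requester id comparison adds no discriminating power, so it could be dropped from both capture and correlation, whereas the serving instance id comparison could not.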

FIG. 4 is an exemplary diagram depicting details of information capture module plugins 322. Information capture module plugins 322 include IAM information capture plugin 400 and information capture plugins 410, 420, 430, and 440. IAM information capture plugin 400 captures information pertaining to IAM framework 310, and information capture plugins 410-440 capture information pertaining to their respective resource provisioners 312-318.

Each of plugins 400-440 captures the required information, forms the captured information into provision event data, and sends the provision event data to provision events queue 334 (see FIG. 5 and corresponding text for further details).

In one embodiment, plugins 400-440 include two types of plugin implementations. A first type of plugin is an HTTP proxy. The HTTP proxy parses HTTP headers and captures information to generate provision events. A second type of plugin is a library that is built into a provisioner or the IAM framework to provide an interface for generating provision events. This plugin type combines raw data of the provision events into data transform records (e.g., JSON) that are sent to provision events queue 334.
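As a rough illustration of the first plugin type, the following Python sketch turns HTTP headers into a JSON provision event record. The header names in `HEADER_TO_FIELD` are hypothetical; the disclosure does not state which headers carry the captured fields, and a real proxy would also need to forward the request.

```python
import json
import time

# Hypothetical header names (assumption, not taken from the disclosure).
HEADER_TO_FIELD = {
    "x-requester-id": "requester_id",
    "x-provisioner-name": "provisioner_name",
    "x-instance-id": "instance_id",
    "x-serving-instance-id": "serving_instance_id",
    "x-provision-status": "status",
}


def event_from_headers(headers, now=None):
    """Build a JSON provision event from HTTP headers.

    Header lookup is case-insensitive; fields whose header is absent are
    recorded as null so the queue always receives the same record shape.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    event = {field: lowered.get(header) for header, field in HEADER_TO_FIELD.items()}
    event["timestamp"] = time.time() if now is None else now
    return json.dumps(event, sort_keys=True)


record = event_from_headers(
    {"X-Requester-Id": "u1", "X-Instance-Id": "i1", "X-Provision-Status": "start"},
    now=5.0,
)
```

The resulting string is the kind of data transform record that could be sent to provision events queue 334.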

FIG. 5 is an exemplary diagram depicting provisioning event data aggregated into a provisioning events queue. Information capture plugin 410 (along with plugins 420-440) sends event data 500 to provision events queue 334. Likewise, IAM information capture plugin 400 sends event data 510 to provision events queue 334.

Provision events queue 334 aggregates the event data into two sets of information, which are correlation and aggregation information (columns 520, 530, 540, 550, and 560) and provision properties (columns 570, 580, and 590). The correlation and aggregation information includes requester ID column 520, which includes IDs of end users who trigger the provision request. Provisioner name column 530 includes names of provisioners; IAM framework 310 is also abstracted as a provisioner. Instance ID column 540 includes IDs of instances that are provisioned by a provisioner. Serving instance ID column 550 includes IDs of instances that trigger provisions to occur. In one embodiment, a service instance provision may provision several lower layer service instances as components. For example, kubernetes provisioner 312 may need to provision virtual machine (VM) provisioner 314 and storage provisioner 316 as dependencies. Next provisioner column 560 includes names of a next provisioner of the provision request.

The provision properties information includes timestamp column 570 that includes timestamps of when provision events are generated. Status column 580 includes flags that indicate whether the provision starts or completes. Other information column 590 includes other user required information.
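The column layout of FIG. 5 can be pictured as a record type. The Python sketch below is one possible representation; the class name, field names, and types are assumptions chosen for illustration and are not defined in the disclosure.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ProvisionEvent:
    # Correlation and aggregation information (columns 520 through 560).
    requester_id: str
    provisioner_name: str
    instance_id: str
    serving_instance_id: Optional[str]  # None for an entry-point event (assumption)
    next_provisioner: Optional[str]
    # Provision properties (columns 570 through 590).
    timestamp: float
    status: str
    other: Optional[dict] = None

    def aggregation_key(self):
        """Requester id, provisioner, and instance id: the combination used
        later to aggregate events into a single chain node."""
        return (self.requester_id, self.provisioner_name, self.instance_id)


event = ProvisionEvent("u1", "kubernetes", "k1", "s1", "iam", 1.0, "start")
row = asdict(event)  # a plain dict, ready to serialize for the queue
```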

FIG. 6 is an exemplary diagram depicting a fault analysis module generating a provision chain from provision events.

Correlation engine 330 includes events selection 600, next events type classification 610, correlation checking 620, and correlation & aggregation 630. In response to receiving fault analysis request 326, events selection 600 selects each of the events from provision events queue 334 related to user entry points (e.g., components that are first accessed). Next events type classification 610 determines, from the selected events, the type of the next event (i.e., the next provisioner). Correlation checking 620 checks if there are events that fulfill a correlation rule. In one embodiment, the correlation rule default is “Requester id=selected.request id;” “Serving instance id=selected.instance id.”
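One reading of the default correlation rule can be sketched in Python as follows. The sketch assumes that events are dictionaries keyed by the FIG. 5 column names, that "selected" refers to the event currently being followed, and that the next events type classification narrows candidates by provisioner name; these details are interpretations rather than statements from the disclosure.

```python
def fulfills_default_rule(selected, candidate):
    """Default rule: same requester, and the candidate was triggered by
    (is served by) the instance of the selected event."""
    return (
        candidate["requester_id"] == selected["requester_id"]
        and candidate["serving_instance_id"] == selected["instance_id"]
    )


def correlated_events(selected, events):
    """Keep only events of the next provisioner type, then apply the rule."""
    return [
        e
        for e in events
        if e["provisioner_name"] == selected["next_provisioner"]
        and fulfills_default_rule(selected, e)
    ]


selected = {"requester_id": "u1", "provisioner_name": "service",
            "instance_id": "s1", "serving_instance_id": None,
            "next_provisioner": "kubernetes"}
queue = [
    {"requester_id": "u1", "provisioner_name": "kubernetes",
     "instance_id": "k1", "serving_instance_id": "s1", "next_provisioner": None},
    {"requester_id": "u2", "provisioner_name": "kubernetes",
     "instance_id": "k9", "serving_instance_id": "s1", "next_provisioner": None},
    {"requester_id": "u1", "provisioner_name": "storage",
     "instance_id": "st1", "serving_instance_id": "s1", "next_provisioner": None},
]
hits = correlated_events(selected, queue)
```

Only the first queued event is returned: the second belongs to another requester, and the third is not of the next provisioner type.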

If a correlation rule is fulfilled, then correlation checking 620 interacts with events selection 600 to select all the events from provision events queue 334. If the correlation rule is not fulfilled, then correlation checking 620 interacts with correlation and aggregation 630 to generate a link between events that fulfill the correlation and aggregate the events that fulfill a criterion, such as “Requester id & same Provisioner & same instance id” (see FIG. 8 and corresponding text for further details).

Correlation and aggregation 630 generates links between each of the events that fulfill the correlation. Correlation and aggregation 630 aggregates each of the events that fulfill the same ‘Requester id & same Provisioner & same instance id’ into the same node and rewrites their status as: ‘start->completed->failed.’ In turn, correlation and aggregation 630 generates provision chain 332 that identifies isolation points of failure (see FIGS. 9, 10, and corresponding text for further details).
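The aggregation step can be sketched in Python under one assumption that the disclosure leaves open: the rewritten status of a node is taken to be the statuses of its events, ordered by timestamp and joined with "->". Events are again represented as dictionaries keyed by the FIG. 5 column names.

```python
from collections import defaultdict


def aggregate(events):
    """Group events by (requester id, provisioner, instance id) and rewrite
    each group's status as a time-ordered trail such as 'start->failed'."""
    groups = defaultdict(list)
    for e in events:
        key = (e["requester_id"], e["provisioner_name"], e["instance_id"])
        groups[key].append(e)
    nodes = {}
    for key, group in groups.items():
        ordered = sorted(group, key=lambda e: e["timestamp"])
        nodes[key] = "->".join(e["status"] for e in ordered)
    return nodes


events = [
    {"requester_id": "u1", "provisioner_name": "storage", "instance_id": "st1",
     "timestamp": 3, "status": "failed"},
    {"requester_id": "u1", "provisioner_name": "storage", "instance_id": "st1",
     "timestamp": 1, "status": "start"},
    {"requester_id": "u1", "provisioner_name": "vm", "instance_id": "v1",
     "timestamp": 1, "status": "start"},
    {"requester_id": "u1", "provisioner_name": "vm", "instance_id": "v1",
     "timestamp": 2, "status": "completed"},
]
nodes = aggregate(events)
```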

FIG. 7 is an exemplary flowchart depicting steps taken to capture provision event information and identify isolation points of failures based on generated provision chains. FIG. 7 processing commences at 700 whereupon, at step 710, the process captures provision information from each of provisioners 312-318 and captures authorization information from IAM framework 310 during provision request fulfillments.

At step 720, the process generates event data based on the captured information and stores the events in provision events queue 334 (see FIG. 5 and corresponding text for further details). The process determines whether it has received fault analysis request 324 (decision 730). If the process has not received fault analysis request 324, then decision 730 branches to the ‘no’ branch, which loops back to continue to capture provision information. This looping continues until the process receives fault analysis request 324, at which point decision 730 branches to the ‘yes’ branch, exiting the loop.

At step 740, the process (e.g., service provisioner 304) sends fault analysis request 326 to fault analysis module 328. At predefined process 750, the process performs correlation analysis and provision chain generation steps (see FIG. 8 and corresponding text for processing details). At step 760, the process sends fault analysis results 338 to client 300 based on the generated provision chains. In one embodiment, the process sends a detailed provision chain such as the example shown in FIG. 9. In another embodiment, the process sends a refined provision chain such as the example shown in FIG. 10.

At step 770, in one embodiment, the process uses optimization engine 336 to analyze the provision event data and identify ways to minimize the information collected in the future for a certain user, which optimization engine 336 applies to the information capture module plugins. In addition, at step 780, the process updates correlation rules in correlation engine 330 according to the optimization results. FIG. 7 processing thereafter ends at 795.

FIG. 8 is an exemplary flowchart depicting steps taken to generate a provision chain based on provision events. FIG. 8 processing commences at 800 whereupon, at step 810, the process identifies entry points corresponding to user requests and selects events related to the identified entry points.

At step 820, the process identifies a next provisioner corresponding to the provision events (column 560 in FIG. 5) and, at step 830, the process checks if there are events that fulfill a correlation rule. For example, if the next provisioner is identified as IAM framework 310, only events of type IAM framework 310 are selected for the next check.

The process determines whether a correlation rule is fulfilled (decision 840). If a correlation rule is fulfilled, then decision 840 branches to the ‘yes’ branch, whereupon the process selects the events that fulfill the correlation rule and generates links between the previous events and the selected events (step 850). Then, for each of the selected events, the process loops back to check the next link (step 820). This looping continues until there are no more events that fulfill a correlation rule, at which point decision 840 branches to the ‘no’ branch, exiting the loop.
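The linking loop of steps 810 through 850 might be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `ProvisionEvent` fields, the `same_requester` rule, and the `build_links` function are hypothetical names. Only the notion of a next provisioner (column 560 in FIG. 5) and of correlation rules, such as a matching requester identifier, comes from the text.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvisionEvent:
    # Hypothetical event layout; field names are assumptions.
    event_id: int
    requester_id: str
    provisioner: str
    next_provisioner: Optional[str] = None


def same_requester(prev, cand):
    """Example correlation rule: both events carry the same requester id."""
    return prev.requester_id == cand.requester_id


def build_links(events, entry_provisioner, rules=(same_requester,)):
    """Link events outward from the entry point, following each event's
    next provisioner and keeping only candidates that satisfy every rule."""
    links = []
    seen = set()
    # Step 810: start from events that belong to the entry point.
    frontier = deque(e for e in events if e.provisioner == entry_provisioner)
    while frontier:
        prev = frontier.popleft()
        if prev.next_provisioner is None:
            continue  # nothing further to check for this event
        for cand in events:
            # Step 820: only events of the next provisioner's type qualify.
            if cand.provisioner != prev.next_provisioner:
                continue
            # Steps 830/840: every correlation rule must be fulfilled.
            if not all(rule(prev, cand) for rule in rules):
                continue
            link = (prev.event_id, cand.event_id)
            if link not in seen:
                # Step 850: record the link, then check the next link later.
                seen.add(link)
                links.append(link)
                frontier.append(cand)
    return links


events = [
    ProvisionEvent(1, "user-1", "service", "iam"),
    ProvisionEvent(2, "user-1", "iam"),
    ProvisionEvent(3, "user-2", "iam"),
    ProvisionEvent(4, "user-1", "service", "kubernetes"),
    ProvisionEvent(5, "user-1", "kubernetes"),
]
links = build_links(events, "service")
```

Here event 3 is excluded because its requester differs, so only the pairs (1, 2) and (4, 5) are linked. The `seen` set ensures each link is recorded once, so the loop terminates even if provisioners reference one another.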

At step 860, in one embodiment, the process aggregates each of the events that fulfill the same “Requester id & same Provisioner & same instance id” combination and rewrites the status as start→completed→failed. At step 870, the process analyzes the correlation and aggregation of each selected event and generates a provision chain accordingly (see FIGS. 9, 10, and corresponding text for further details). FIG. 8 processing thereafter returns to the calling routine (see FIG. 7) at 895.
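The aggregation of step 860 might be sketched as follows. This is an illustrative assumption rather than the disclosed implementation: the tuple layout and the `aggregate_statuses` function are hypothetical, and the events are assumed to arrive in chronological order so that joining their statuses yields a sequence of the form start→completed→failed.

```python
from collections import defaultdict


def aggregate_statuses(events):
    """Group events by (requester id, provisioner, instance id) and
    rewrite each group's status as an arrow-joined sequence.

    `events` is an iterable of 4-tuples
    (requester_id, provisioner, instance_id, status), assumed to be
    in chronological order.
    """
    grouped = defaultdict(list)
    for requester_id, provisioner, instance_id, status in events:
        grouped[(requester_id, provisioner, instance_id)].append(status)
    # "\u2192" is the right-arrow character used in the status notation.
    return {key: "\u2192".join(statuses) for key, statuses in grouped.items()}


events = [
    ("user-1", "network", "net-1", "start"),
    ("user-1", "storage", "vol-1", "start"),
    ("user-1", "network", "net-1", "failed"),
    ("user-1", "storage", "vol-1", "completed"),
]
summary = aggregate_statuses(events)
```

Each key in `summary` corresponds to one aggregated node of the kind drawn as a single icon in a refined provision chain.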

FIG. 9 is an exemplary diagram depicting a detailed provision chain result. Provision chain 900 includes (i) details of each provision event; (ii) links between various provision events based on correlation rules discussed herein (905, 910, 915, 920, 925, 930, 935, and 940); (iii) provision event aggregations (dashed lines); and (iv) isolation points of failure (950, 955, 960, and 965) within various provision events. In one embodiment, the approach discussed herein aggregates provision chain 900 into an easily viewable provision chain such as that shown in FIG. 10.

FIG. 10 is an exemplary diagram depicting a refined provision chain. Fault analysis module 328 aggregates all the provision events that fulfill the same “Requester ID, Provisioner, instance ID” and generates a simplified provision chain topology 1000. Topology 1000 includes service provisioner icon 1010, IAM icon 1020, Kubernetes provisioner icon 1030, IAM icon 1040, VM1 provisioner icon 1050, storage provisioner icon 1060, and network provisioner icon 1070. In addition, topology 1000 includes links 905-940 that match links 905-940 shown in FIG. 9.

In the embodiment shown in FIG. 10, topology 1000 includes “!” within provisioner icons that correspond to isolation points of failure 950, 955, 960, and 965 shown in FIG. 9. Topology 1000 shows that the failure isolation points lie in a network provisioning from network provisioner 318 initiated by VM1 provisioner 314. As a result, a user can focus efforts on correcting the provisioning request failure instead of analyzing each provision event generated from the provision request.

While particular embodiments of the present disclosure have been shown and described, it will be obvious to those skilled in the art, based upon the teachings herein, that changes and modifications may be made without departing from this disclosure and its broader aspects. Therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this disclosure. Furthermore, it is to be understood that the disclosure is solely defined by the appended claims. It will be understood by those with skill in the art that if a specific number of an introduced claim element is intended, such intent will be explicitly recited in the claim, and in the absence of such recitation no such limitation is present. For non-limiting example, as an aid to understanding, the following appended claims contain usage of the introductory phrases “at least one” and “one or more” to introduce claim elements. However, the use of such phrases should not be construed to imply that the introduction of a claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to disclosures containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an”; the same holds true for the use in the claims of definite articles.

1. A method implemented by an information handling system that includes a memory and a processor, the method comprising: creating a set of provision events from a set of provision information, wherein the set of provision information is generated from a set of provisioners in process of fulfilling a provision request from a client; generating a provision chain from the set of provision events in response to detecting a failure of the provision request, wherein the provision chain links the set of provision events based on one or more correlation rules and identifies at least one isolation point of the failure; and informing the client of the at least one isolation point of the failure.
2. The method of claim 1 wherein the set of provision events comprises a first provision event and a second provision event, the method further comprising: identifying a first provisioner as an initiator of the first provision event and identifying a second provisioner as a next provisioner corresponding to the first provision event; determining that the second provisioner is an initiator of a second provision event; and linking the second provision event to the first provision event in response to the determination.
3. The method of claim 2 wherein at least one of the one or more correlation rules require the first provision event and the second provision event to comprise a matching requestor identifier corresponding to the client.
4. The method of claim 2 further comprising: wherein the first provision event comprises a first requestor identifier corresponding to the client, a first provisioner identifier corresponding to the first provisioner, and a first instance identifier corresponding to the first provisioner; wherein a third provision event from the set of provision events comprises a second requestor identifier, a second provisioner identifier, and a second instance identifier; and aggregating the first provision event with the third provision event in response to determining that the first requestor identifier matches the second requestor identifier, the first provisioner identifier matches the second provision identifier, and the first instance identifier matches the second instance identifier.
5. The method of claim 1 further comprising: assigning a set of plugins to the set of provisioners, and capturing, by each of the set of plugins, the set of provision information from its assigned provisioner while in process of fulfilling the provision request.
6. The method of claim 1 wherein at least one of the set of provisioners is a service provisioner, a resource provisioner, and an IAM framework.
7. The method of claim 1 further comprising: optimizing the set of plugins based on the created set of provision events to capture a future set of provision information from the set of provisioners; and optimizing the one or more correlation rules based on the created set of provision events.
8. An information handling system comprising: one or more processors; a memory coupled to at least one of the processors; a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of: creating a set of provision events from a set of provision information, wherein the set of provision information is generated from a set of provisioners in process of fulfilling a provision request from a client; generating a provision chain from the set of provision events in response to detecting a failure of the provision request, wherein the provision chain links the set of provision events based on one or more correlation rules and identifies at least one isolation point of the failure; and informing the client of the at least one isolation point of the failure.
9. The information handling system of claim 8 wherein the set of provision events comprises a first provision event and a second provision event, and wherein the processors perform additional actions comprising: identifying a first provisioner as an initiator of the first provision event and identifying a second provisioner as a next provisioner corresponding to the first provision event; determining that the second provisioner is an initiator of a second provision event; and linking the second provision event to the first provision event in response to the determination.
10. The information handling system of claim 9 wherein at least one of the one or more correlation rules require the first provision event and the second provision event to comprise a matching requestor identifier corresponding to the client.
11. The information handling system of claim 9 wherein the processors perform additional actions comprising: wherein the first provision event comprises a first requestor identifier corresponding to the client, a first provisioner identifier corresponding to the first provisioner, and a first instance identifier corresponding to the first provisioner; wherein a third provision event from the set of provision events comprises a second requestor identifier, a second provisioner identifier, and a second instance identifier; and aggregating the first provision event with the third provision event in response to determining that the first requestor identifier matches the second requestor identifier, the first provisioner identifier matches the second provision identifier, and the first instance identifier matches the second instance identifier.
12. The information handling system of claim 8 wherein the processors perform additional actions comprising: assigning a set of plugins to the set of provisioners, and capturing, by each of the set of plugins, the set of provision information from its assigned provisioner while in process of fulfilling the provision request.
13. The information handling system of claim 8 wherein at least one of the set of provisioners is a service provisioner, a resource provisioner, and an IAM framework.
14. The information handling system of claim 8 wherein the processors perform additional actions comprising: optimizing the set of plugins based on the created set of provision events to capture a future set of provision information from the set of provisioners; and optimizing the one or more correlation rules based on the created set of provision events.
15. A computer program product stored in a computer readable storage medium, comprising computer program code that, when executed by an information handling system, causes the information handling system to perform actions comprising: creating a set of provision events from a set of provision information, wherein the set of provision information is generated from a set of provisioners in process of fulfilling a provision request from a client; generating a provision chain from the set of provision events in response to detecting a failure of the provision request, wherein the provision chain links the set of provision events based on one or more correlation rules and identifies at least one isolation point of the failure; and informing the client of the at least one isolation point of the failure.
16. The computer program product of claim 15 wherein the set of provision events comprises a first provision event and a second provision event, and wherein the information handling system performs further actions comprising: identifying a first provisioner as an initiator of the first provision event and identifying a second provisioner as a next provisioner corresponding to the first provision event; determining that the second provisioner is an initiator of a second provision event; and linking the second provision event to the first provision event in response to the determination.
17. The computer program product of claim 16 wherein at least one of the one or more correlation rules require the first provision event and the second provision event to comprise a matching requestor identifier corresponding to the client.
18. The computer program product of claim 16 wherein the information handling system performs further actions comprising: wherein the first provision event comprises a first requestor identifier corresponding to the client, a first provisioner identifier corresponding to the first provisioner, and a first instance identifier corresponding to the first provisioner; wherein a third provision event from the set of provision events comprises a second requestor identifier, a second provisioner identifier, and a second instance identifier; and aggregating the first provision event with the third provision event in response to determining that the first requestor identifier matches the second requestor identifier, the first provisioner identifier matches the second provision identifier, and the first instance identifier matches the second instance identifier.
19. The computer program product of claim 15 wherein the information handling system performs further actions comprising: assigning a set of plugins to the set of provisioners, and capturing, by each of the set of plugins, the set of provision information from its assigned provisioner while in process of fulfilling the provision request.
20. The computer program product of claim 15 wherein the information handling system performs further actions comprising: optimizing the set of plugins based on the created set of provision events to capture a future set of provision information from the set of provisioners; and optimizing the one or more correlation rules based on the created set of provision events.